Journal of KIISE (정보과학회논문지)

Korean Title: LDA와 WMD 기반의 공간 변환을 이용한 효과적인 문서 클러스터링 방법
English Title: An Efficient Document Clustering Method using Space Transformation based on LDA and WMD
Authors: Yongdam Kim (김용담), Sungwon Jung (정성원)
Citation: Vol. 48, No. 9, pp. 1052-1060 (September 2021)
Korean Abstract
Existing TF-IDF-based document clustering techniques do not make sufficient use of a document's contextual information, namely co-occurrence and word order, ...
English Abstract
Existing TF-IDF-based document clustering methods do not properly exploit the contextual information of documents, i.e., co-occurrence and word order, and their performance tends to degrade due to the curse of dimensionality. To overcome these problems, techniques such as a weighted average of word embedding vectors and Word Mover's Distance (WMD) have been proposed. These techniques perform well on document classification, but not on document clustering, which requires grouping documents. In this study, we use LDA to define a topic document, which serves as the representative document of a document group, and address the existing problems by calculating the WMD with respect to the topic documents. However, since WMD requires a large amount of computation, we propose a space transformation method that achieves good performance while reducing the computation cost by mapping each document to a low-dimensional space in which each axis is the WMD value from one topic document.
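The space transformation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the embeddings, topic documents, and function names (`nbow`, `relaxed_wmd`, `transform`) are hypothetical, and the exact WMD is replaced here by the relaxed WMD lower bound so that the example needs only the Python standard library. The topic documents are assumed to be word distributions of the kind LDA produces for each topic.

```python
import math
from collections import Counter

# Toy 2-D word embeddings (hypothetical values, for illustration only).
EMBEDDINGS = {
    "goal": (1.0, 0.0),
    "match": (0.9, 0.1),
    "team": (0.8, 0.2),
    "vote": (0.0, 1.0),
    "party": (0.1, 0.9),
    "election": (0.2, 0.8),
}

def nbow(words):
    """Normalized bag-of-words: word -> relative frequency."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def one_sided_cost(src, dst):
    """Each source word sends all of its mass to its nearest word in dst."""
    return sum(
        mass * min(math.dist(EMBEDDINGS[w], EMBEDDINGS[v]) for v in dst)
        for w, mass in src.items()
    )

def relaxed_wmd(doc_a, doc_b):
    """Relaxed WMD: a lower bound on the exact WMD, not the exact distance."""
    return max(one_sided_cost(doc_a, doc_b), one_sided_cost(doc_b, doc_a))

def transform(doc_words, topic_docs):
    """Map a document to a vector with one coordinate per topic document."""
    doc = nbow(doc_words)
    return [relaxed_wmd(doc, topic) for topic in topic_docs]

# Assumed topic documents: word distributions such as an LDA topic's top words.
TOPIC_DOCS = [
    {"goal": 0.5, "team": 0.5},      # hypothetical "sports" topic
    {"vote": 0.5, "election": 0.5},  # hypothetical "politics" topic
]

sports_vec = transform(["match", "team", "goal"], TOPIC_DOCS)
politics_vec = transform(["party", "vote", "election"], TOPIC_DOCS)
```

Each document becomes a vector whose length equals the number of topic documents, so a document is closest along the axis of the topic it resembles. These low-dimensional vectors could then be passed to a standard clustering algorithm; the abstract does not say which one the authors use.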
Keywords: document clustering, word mover's distance, space transformation method, word embedding